
Adding OTLP Exporter #699

Merged
merged 13 commits into from
Aug 17, 2020
Conversation

codeboten
Contributor

This PR adds configuration and retry behaviour specification for the OTLP exporter. Most of the configuration options are based off of the opentelemetry-collector configuration and the environment variables proposed are based on #666

Fixes #681

@codeboten codeboten requested review from a team July 13, 2020 21:50
@carlosalberto
Contributor

This is a good start!

@tsloughter
Member

Since OTLP supports 3 protocols, grpc, http/json and http/protobuf, it'd be good if there were a configuration option to specify which protocol to use. It can probably just be `protocol` instead of splitting it out into `transport` and `content_type`?

@codeboten
Contributor Author

> Since OTLP supports 3 protocols, grpc, http/json and http/protobuf, it'd be good if there were a configuration option to specify which protocol to use. It can probably just be `protocol` instead of splitting it out into `transport` and `content_type`?

Good point @tsloughter, added an option for protocol.
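For illustration, a `protocol` option along the lines discussed above might be resolved like this. This is a hypothetical sketch, not part of the PR: the env variable name, the default of `grpc`, and the function itself are assumptions.

```python
import os

# Sketch of resolving a single "protocol" option instead of separate
# transport/content_type settings. Names and defaults are illustrative.
SUPPORTED_PROTOCOLS = {"grpc", "http/json", "http/protobuf"}

def resolve_protocol(explicit=None, env_var="OTEL_EXPORTER_OTLP_PROTOCOL"):
    """Return the configured protocol, preferring an explicit argument
    over the environment, and falling back to grpc."""
    protocol = explicit or os.environ.get(env_var, "grpc")
    if protocol not in SUPPORTED_PROTOCOLS:
        raise ValueError(f"unsupported OTLP protocol: {protocol!r}")
    return protocol
```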

@codeboten codeboten requested a review from a team July 20, 2020 20:31
@cijothomas
Member

> Since OTLP supports 3 protocols, grpc, http/json and http/protobuf, it'd be good if there were a configuration option to specify which protocol to use. It can probably just be `protocol` instead of splitting it out into `transport` and `content_type`?

> Good point @tsloughter, added an option for protocol.

Seems like this change got lost.

@codeboten
Contributor Author

> Seems like this change got lost.

It's back now.

@carlosalberto carlosalberto added spec:trace Related to the specification/trace directory area:sdk Related to the SDK labels Jul 21, 2020
@nadiaciobanu

nadiaciobanu commented Jul 22, 2020

Hello! I'm working on the OTLP exporter for OpenTelemetry C++. I was wondering if there's any ETA on the new specification? I'd like to begin updating the C++ OTLP exporter according to new specification soon. @codeboten

@codeboten
Contributor Author

I believe I've addressed all the comments @nadiaciobanu. It would be great if the @open-telemetry/specs-approvers or @open-telemetry/specs-trace-approvers can take another look.


| Configuration Option | Description | Default | Env variable |
| -------------------- | ------------------------------------------------------------ | ----------------- | ------------------------------------------------------------ |
| Endpoint | Target to which the exporter is going to send spans or metrics. This MAY be configured to include a path (e.g. `example.com/v1/traces`). | `localhost:55680` | `OTEL_EXPORTER_OTLP_SPAN_ENDPOINT` `OTEL_EXPORTER_OTLP_METRIC_ENDPOINT` |
Member


It is not completely clear what happens if the Endpoint does not include the path. Is the default path automatically appended? If I specify Endpoint=example.com, does that mean the actual endpoint is example.com/v1/traces? And what if I want the path to be the root and don't want the default path appended?

Contributor Author


My expectation here is that the value configured is taken verbatim and passed to the OTLP exporter in the different languages. I would not expect a path to be appended. This does raise a good question of whether a default value for http and another for grpc makes sense here.

Member


I am not sure I like this approach. We are basically not providing any easy way to use the default port and path (for http transport).

The default path for OTLP/HTTP is defined in the spec: https://github.com/open-telemetry/oteps/blob/master/text/0099-otlp-http.md#request
I think we need to honor it. It is also important to honor the default port if the user does not specify it.

I believe we need to design the env variables such that configuration is easy in the most common case and for the uncommon cases it is OK if it is a bit more difficult.

The most common case is when I need to specify only the IP address or hostname of the server and choose the transport. Everything else I expect to be set to the correct defaults. The defaults include the port number and, when using otlp/http, also the path.

For example I would expect the following to result in the destination of https://example.com:55680/v1/traces:

OTEL_EXPORTER_OTLP_SPAN_ENDPOINT=example.com
OTEL_EXPORTER_OTLP_SPAN_TRANSPORT=http

If I wanted to override the path, I would be OK with specifying the full URL via a different env variable; in that case I wouldn't need to specify ENDPOINT and TRANSPORT:

OTEL_EXPORTER_OTLP_SPAN_ENDPOINT_URL=http://example.com:1234/mytraces

There may be other ways to structure this but ultimately I want to make sure the most common case is easy and with the current PR I think we are not achieving that goal.

Perhaps it is not possible to achieve what I ask for, so feel free to ignore this comment.
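The default-filling behaviour proposed above could be sketched as follows. This is a hypothetical illustration of the comment, not spec text: the function name, the `https` scheme choice, and the handling of scheme-less endpoints are all assumptions; the port 55680 and path `/v1/traces` come from the discussion and the OTLP/HTTP OTEP.

```python
from urllib.parse import urlparse

DEFAULT_PORT = 55680              # default OTLP port discussed in this thread
DEFAULT_HTTP_PATH = "/v1/traces"  # default path from the OTLP/HTTP OTEP

def normalize_endpoint(endpoint, transport="http"):
    """Fill in the default port and, for HTTP, the default path
    when the user supplies only a host (a sketch, not normative)."""
    # urlparse treats "example.com" as a path unless "//" is present.
    parsed = urlparse(endpoint if "//" in endpoint else "//" + endpoint)
    host = parsed.hostname or endpoint
    port = parsed.port or DEFAULT_PORT
    if transport == "http":
        path = parsed.path or DEFAULT_HTTP_PATH
        return f"https://{host}:{port}{path}"
    return f"{host}:{port}"
```

With this sketch, `ENDPOINT=example.com` plus `TRANSPORT=http` yields `https://example.com:55680/v1/traces`, matching the expectation in the comment above.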

@@ -0,0 +1,43 @@
# OpenTelemetry Protocol Collector Exporter
Member


Is this a specification for the Collector, or for any code that implements the client portion of the OTLP protocol? Why do we refer specifically to the Collector here?

Contributor Author


This is specifically around the OTLP exporter, which the go/js/python implementations refer to as the collector exporter.

Member


I think it is a bit misleading, since OTLP is not used by the Collector only and the Collector does not use OTLP only; it just happens that the Collector can also accept OTLP. I think it is better to delete the references to the "Collector" here and in the OTLP implementations in the SDKs.

Contributor Author


Fair enough. Removed the references to the collector from this PR.

@@ -0,0 +1,43 @@
# OpenTelemetry Protocol Collector Exporter

This document specifies the configuration options available to the OpenTelemetry Protocol ([OTLP](https://github.com/open-telemetry/oteps/blob/master/text/0035-opentelemetry-protocol.md)) [Collector](https://github.com/open-telemetry/opentelemetry-collector) `SpanExporter` and `MetricsExporter` as well as the retry behavior.
Member


I am not sure what SpanExporter and MetricsExporter refer to in the context of the Collector. Just to clarify: there are currently no such concepts in the Collector. Is this a proposal to add them?

Contributor


I think this is an artifact of some SDK implementations providing these separately, and because the processing & export pipelines for metrics and spans in the SDKs are configured separately.

The Go SDK provides a single OTLP exporter that implements both the trace.SpanBatcher and metric.Exporter interfaces, and it would be ideal IMO to be able to configure a single instance/connection.

The Java SDK implements the exporters separately, and I think they use separate connections.

Most other SDKs have only implemented trace export for OTLP so far.

Given that the protocol is designed to multiplex different types of telemetry, I think it'd be good if the spec encouraged implementations to do so over a single connection. This would also simplify configuration, since we wouldn't need to duplicate all the environment variables for *_SPAN_* and *_METRIC_*.

Contributor Author


While I agree that it would greatly simplify configuration, we would also be losing the flexibility of sending different types of telemetry data (traces/logs/metrics) to different backends by combining the configuration into a single one.

Though I suppose a user would still be able to accomplish this in code (by instantiating separate exporters for instance), they just wouldn't be able to do this via environment variable configuration.

| Certificate File | Certificate file for TLS credentials of gRPC client. Should only be used if `insecure` is set to `false`. | n/a | `OTEL_EXPORTER_OTLP_SPAN_CERTIFICATE` `OTEL_EXPORTER_OTLP_METRIC_CERTIFICATE` |
| Headers | The headers associated with gRPC or HTTP requests. | n/a | `OTEL_EXPORTER_OTLP_SPAN_HEADERS` `OTEL_EXPORTER_OTLP_METRIC_HEADERS` |
| Compression | Compression key for supported compression types. Supported compression: `gzip`| no compression | `OTEL_EXPORTER_OTLP_SPAN_COMPRESSION` `OTEL_EXPORTER_OTLP_METRIC_COMPRESSION` |
| Timeout | Max waiting time for the backend to process each batch of spans or metrics. | 60s | `OTEL_EXPORTER_OTLP_SPAN_TIMEOUT` `OTEL_EXPORTER_OTLP_METRIC_TIMEOUT` |
Contributor


Some other configuration options are exposed by some exporter implementations. I'm not sure if these are generalizable to implementation in all SDKs, but they're worth considering (if they were previously considered and rejected in this PR, please ignore this comment):

Go also allows configuration of gRPC, which includes the retry limit (amongst other things).

@fbogsany
Contributor

fbogsany commented Aug 4, 2020

This PR should also provide clarity about default ports for different protocols. None of the OTEPs for OTLP say anything concrete about default ports. Different SDKs have made incompatible choices.

Python and C++ seem to be the outliers here.

There is also contention in the collector about whether the ports for gRPC and HTTP receivers should be split or unified.

@codeboten
Contributor Author

codeboten commented Aug 4, 2020

* [Python](https://github.com/open-telemetry/opentelemetry-python/blob/3143a4ba16af096584bf1bffbe2d673a87fecc34/exporter/opentelemetry-exporter-otlp/src/opentelemetry/exporter/otlp/trace_exporter/__init__.py#L92) uses `localhost:55678` as the default endpoint (which is actually the default for OpenCensus)

Good catch on the port discrepancy. I suspect it's an artifact of the OpenCensus receiver having been implemented first. I've created issues to update this: open-telemetry/opentelemetry-python#968 and open-telemetry/opentelemetry-cpp#239

@fbogsany
Contributor

fbogsany commented Aug 4, 2020

It would also be valuable to specify metrics here. When monitoring a tracing pipeline in production, it is useful to have consistent metrics across different SDKs. We have open-telemetry/opentelemetry-ruby#309 open in the Ruby SDK, which suggests:

At least the following metrics are useful.

In exporters:

- `export.failure` - counts failed export requests
- `export.success` - counts successful export requests
- `export.spans` - counts spans successfully exported

In BatchSpanProcessor and SimpleSpanProcessor:

- `dropped_spans` - counts spans permanently dropped
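As a rough illustration of the exporter-side counters suggested above (the counter names come from the linked Ruby issue; the wrapper class, `send` callable, and use of a plain `Counter` are assumptions for the sketch):

```python
from collections import Counter

class InstrumentedExporter:
    """Hypothetical wrapper that counts export outcomes around any
    send function; a sketch of the suggested metrics, not an SDK API."""

    def __init__(self, send):
        self._send = send          # callable performing the actual export
        self.metrics = Counter()

    def export(self, spans):
        try:
            self._send(spans)
        except Exception:
            self.metrics["export.failure"] += 1
            raise
        self.metrics["export.success"] += 1
        self.metrics["export.spans"] += len(spans)
```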

@carlosalberto
Contributor

> It would also be valuable to specify metrics here.

As this PR and the discussion have grown rather large, I suggest working on these in follow-up issues/PRs so we can merge this ASAP. Let's focus on having the minimum, correct set of options that an OTLP exporter MUST define, and iterate on the nice-to-have things (and also the more exotic features that may be supported depending on the language).

@codeboten codeboten requested a review from a team August 7, 2020 16:26
@codeboten
Contributor Author

codeboten commented Aug 7, 2020

@andrewhsu sounds good. i'll get the CI passing.

@carlosalberto
Contributor

@bogdandrutu Please review this, as it looks ready to be merged.

@fbogsany
Contributor

fbogsany commented Aug 8, 2020

I'm concerned about the implicit expectation that Span and Metric exporters are configured separately. OTLP is designed to support multiplexing different forms of telemetry (metrics, traces, logs) over a single connection. The Go implementation, at least, encourages configuring it in that way. Doubling (tripling, when we add logs) the configuration variables by requiring them to be specified separately introduces unnecessary complexity into both exporter code (if we continue to support multiplexing, which we should) and service configuration.

The default deployment we should expect, when OTLP is used for both metrics and spans, is to send them both to the same destination (the OTel collector). If someone chooses to use, say, Prometheus for metrics and OTLP for spans, then the configuration of OTLP export should not change - all that changes is that it is not hooked into the metric export pipeline. I would expect OTLP export of metrics and spans to different destinations to be the exception rather the common case.

@tigrannajaryan
Member

> I'm concerned about the implicit expectation that Span and Metric exporters are configured separately. OTLP is designed to support multiplexing different forms of telemetry (metrics, traces, logs) over a single connection. The Go implementation, at least, encourages configuring it in that way. Doubling (tripling, when we add logs) the configuration variables by requiring them to be specified separately introduces unnecessary complexity into both exporter code (if we continue to support multiplexing, which we should) and service configuration.

> The default deployment we should expect, when OTLP is used for both metrics and spans, is to send them both to the same destination (the OTel collector). If someone chooses to use, say, Prometheus for metrics and OTLP for spans, then the configuration of OTLP export should not change - all that changes is that it is not hooked into the metric export pipeline. I would expect OTLP export of metrics and spans to different destinations to be the exception rather the common case.

I agree with @fbogsany and I had the same concern.

From the number of upvotes I see this is a shared concern.

At the risk of repeating what Francis said: my expectation would be that it should be possible to configure a single ENDPOINT env variable to which all telemetry is then sent (traces, metrics and, in the future, logs as well). This is very likely going to be the most common use case, with the ENDPOINT likely pointing to the Collector. As a consequence, all other options, such as compression, transport, certificate, etc., are also likely to be the same for all signal types.

Requiring each signal to repeat all these env variables makes the most common use case complicated. We should aim to make the most frequent case easy.

I understand that sometimes (likely rarely) we may want to send different signals to different endpoints or use different options. This is rare and it is ok if it is more cumbersome to configure (e.g. with additional env variables per signal which override the common ones).

For the sake of making progress, I am happy if this PR gets merged as-is, provided we create an issue to address this concern before GA. @codeboten if you are OK with this approach, please create the issue.
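The override scheme described in this comment (a signal-specific variable taking precedence over a common one shared by all signals) can be sketched as below. The lookup function and the exact variable-name pattern are illustrative assumptions, not part of the merged spec.

```python
import os

def lookup_otlp_option(signal, option, env=os.environ):
    """Return a signal-specific setting if present, else the common one.

    Hypothetical precedence sketch: e.g. OTEL_EXPORTER_OTLP_SPAN_ENDPOINT
    overrides OTEL_EXPORTER_OTLP_ENDPOINT for the trace signal.
    """
    specific = f"OTEL_EXPORTER_OTLP_{signal}_{option}"
    common = f"OTEL_EXPORTER_OTLP_{option}"
    return env.get(specific) or env.get(common)
```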

@codeboten
Contributor Author

> For the sake of making progress, I am happy if this PR gets merged as-is, provided we create an issue to address this concern before GA. @codeboten if you are OK with this approach, please create the issue.

@tigrannajaryan I've opened follow-up issues regarding the OTLP exporter, happy to tackle #790 right away. @fbogsany can you open an issue to address capturing metrics in the span processors?

@carlosalberto
Contributor

> I understand that sometimes (likely rarely) we may want to send different signals to different endpoints or use different options. This is rare and it is ok if it is more cumbersome to configure (e.g. with additional env variables per signal which override the common ones).

This sounds like a good solution IMHO.

@tigrannajaryan
Member

> @tigrannajaryan I've opened follow-up issues regarding the OTLP exporter, happy to tackle #790 right away.

@codeboten Thank you. Up to you if you want to do it in this PR or in a follow up PR.

@carlosalberto
Contributor

Hey @fbogsany, does it sound OK to merge this one and work on the remaining issues in follow-ups?

@tigrannajaryan Please approve ;)

Member

@tigrannajaryan left a comment


LGTM, with the expectation that #790 is coming soon.

@fbogsany
Contributor

I 👍 earlier comments, but that may not have been visible enough. This LGTM. I'll start a PR next week for the span processor metrics.
